87 research outputs found

CAFTAN: a tool for fast mapping, and quality assessment of cDNAs

Author: Val Muñoz María Coral Del
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/10/2006

Background: The German cDNA Consortium has been cloning full length cDNAs and continued with their exploitation in protein localization experiments and cellular assays. However, the efficient use of large cDNA resources requires the development of strategies that are capable of a speedy selection of truly useful cDNAs from biological and experimental noise. To this end we have developed a new high-throughput analysis tool, CAFTAN, which simplifies these efforts and thus fills the gap between large-scale cDNA collections and their systematic annotation and application in functional genomics. Results: CAFTAN is built around the mapping of cDNAs to the genome assembly, and the subsequent analysis of their genomic context. It uses sequence features like the presence and type of PolyA signals, inner and flanking repeats, the GC-content, splice site types, etc. All these features are evaluated in individual tests and classify cDNAs according to their sequence quality and likelihood to have been generated from fully processed mRNAs. Additionally, CAFTAN compares the coordinates of mapped cDNAs with the genomic coordinates of reference sets from publicly available resources (e.g., VEGA, ENSEMBL). This provides detailed information about overlapping exons and the structural classification of cDNAs with respect to the reference set of splice variants. The evaluation of CAFTAN showed that it is able to correctly classify more than 85% of 5950 selected "known protein-coding" VEGA cDNAs as high quality multi- or single-exon. It identified as good 80.6% of the single exon cDNAs and 85% of the multiple exon cDNAs. The program is written in Perl and in a modular way, allowing the adoption of this strategy to other tasks like EST-annotation, or to extend it by adding new classification rules and new organism databases as they become available. We think that it is a very useful program for the annotation and research of unfinished genomes.
Conclusion: CAFTAN is a high-throughput sequence analysis tool, which performs a fast and reliable quality prediction of cDNAs. Several thousands of cDNAs can be analyzed in a short time, giving the curator/scientist a first quick overview about the quality and the already existing annotation of a set of cDNAs. It supports the rejection of low quality cDNAs and helps in the selection of likely novel splice variants, and/or completely novel transcripts for new experiments.
Funding: German Federal Ministry of Education and Research 01GR0101 and 01GR0420 and 01GR045
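
The CAFTAN abstract above lists GC-content and the presence and type of PolyA signals among the sequence features it tests. As a rough illustration only, the Python sketch below computes two such features. CAFTAN itself is written in Perl and its actual tests are not described in this record, so the function names, the 50-nt search window, and the choice of the hexamers AATAAA and ATTAAA are assumptions made for this example.

```python
# Illustrative sketch only: not CAFTAN's code. Function names, the search
# window, and the signal list are assumptions made for this example.

POLYA_SIGNALS = ("AATAAA", "ATTAAA")  # canonical hexamer and a common variant


def gc_content(seq: str) -> float:
    """Return the fraction of G/C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def find_polya_signal(seq: str, window: int = 50):
    """Look for a polyA signal in the last `window` bases of a cDNA.

    Returns (signal, distance) for the hit closest to the 3' end, where
    distance is counted from the first base of the hexamer to the 3' end,
    or None if no signal is found.
    """
    tail = seq.upper()[-window:]
    best = None
    for signal in POLYA_SIGNALS:
        pos = tail.rfind(signal)
        if pos != -1 and (best is None or pos > best[1]):
            best = (signal, pos)
    if best is None:
        return None
    signal, pos = best
    return signal, len(tail) - pos


if __name__ == "__main__":
    cdna = "GCGC" + "AATAAA" + "T" * 20
    print(gc_content(cdna), find_polya_signal(cdna))
```

A real pipeline would combine many such tests with the genome mapping; this sketch only shows that each feature can be evaluated independently on the cDNA sequence.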

Repositorio Institucional Universidad de Granada

Rhomboid Protease Dynamics and Lipid Interactions

Author: Bondar Ana Nicoleta
Val Muñoz María Coral Del
Publication venue: 'Elsevier BV'
Publication date: 11/03/2009

Intramembrane proteases, which cleave transmembrane (TM) helices, participate in numerous biological processes encompassing all branches of life. Several crystallographic structures of Escherichia coli GlpG rhomboid protease have been determined. In order to understand GlpG dynamics and lipid interactions in a native-like environment, we have examined the molecular dynamics of wild-type and mutant GlpG in different membrane environments. The irregular shape and small hydrophobic thickness of the protein cause significant bilayer deformations that may be important for substrate entry into the active site. Hydrogen-bond interactions with lipids are paramount in protein orientation and dynamics. Mutations in the unusual L1 loop cause changes in protein dynamics and protein orientation that are relayed to the His-Ser catalytic dyad. Similarly, mutations in TM5 change the dynamics and structure of the L1 loop. These results imply that the L1 loop has an important regulatory role in proteolysis.
Funding: National Institute of General Medical Sciences (GM-74637

Repositorio Institucional Universidad de Granada

cDNA2Genome: A tool for mapping and annotating cDNAs

Author: del Val Coral
Glatting Karl-Heinz
Suhai Sandor
Publication venue: BioMed Central
Publication date: 01/01/2003

BACKGROUND: In recent years several high-throughput cDNA sequencing projects have been funded worldwide with the aim of identifying and characterizing the structure of complete novel human transcripts. However, some of these cDNAs are error prone due to frameshifts and stop codon errors caused by low sequence quality, or to cloning of truncated inserts, among other reasons. Therefore, accurate CDS prediction from these sequences first requires the identification of potentially problematic cDNAs in order to speed up the subsequent annotation process. RESULTS: cDNA2Genome is an application for the automatic high-throughput mapping and characterization of cDNAs. It utilizes current annotation data and the most up-to-date databases, especially in the case of ESTs and mRNAs, in conjunction with a vast number of approaches to gene prediction in order to perform a comprehensive assessment of the cDNA exon-intron structure. The final result of cDNA2Genome is an XML file containing all relevant information obtained in the process. This XML output can easily be used for further analysis such as program pipelines, or the integration of results into databases. The web interface to cDNA2Genome also presents this data in HTML, where the annotation is additionally shown in a graphical form. cDNA2Genome has been implemented under the W3H task framework which allows the combination of bioinformatics tools in tailor-made analysis task flows as well as the sequential or parallel computation of many sequences for large-scale analysis. CONCLUSIONS: cDNA2Genome represents a new versatile and easily extensible approach to the automated mapping and annotation of human cDNAs. The underlying approach allows sequential or parallel computation of sequences for high-throughput analysis of cDNAs.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Profile analysis and prediction of tissue-specific CpG island methylation classes

Author: del Val Coral
Harari Oscar
Previti Christopher
Zwir Igor
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The computational prediction of DNA methylation has become an important topic in the recent years due to its role in the epigenetic control of normal and cancer-related processes. While previous prediction approaches focused merely on differences between methylated and unmethylated DNA sequences, recent experimental results have shown the presence of much more complex patterns of methylation across tissues and time in the human genome. These patterns are only partially described by a binary model of DNA methylation. In this work we propose a novel approach, based on profile analysis of tissue-specific methylation that uncovers significant differences in the sequences of CpG islands (CGIs) that predispose them to a tissue-specific methylation pattern. Results: We defined CGI methylation profiles that separate not only between constitutively methylated and unmethylated CGIs, but also identify CGIs showing a differential degree of methylation across tissues and cell-types or a lack of methylation exclusively in sperm. These profiles are clearly distinguished by a number of CGI attributes including their evolutionary conservation, their significance, as well as the evolutionary evidence of prior methylation. Additionally, we assess profile functionality with respect to the different compartments of protein coding genes and their possible use in the prediction of DNA methylation. Conclusion: Our approach provides new insights into the biological features that determine if a CGI has a functional role in the epigenetic control of gene expression and the features associated with CGI methylation susceptibility. Moreover, we show that the ability to predict CGI methylation is based primarily on the quality of the biological information used and the relationships uncovered between different sources of knowledge.
The strategy presented here is able to predict, besides the constitutively methylated and unmethylated classes, two more tissue specific methylation classes conserving the accuracy provided by leading binary methylation classification methods.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

NORA - Norwegian Open Research Archives

Profile analysis and prediction of tissue-specific CpG island methylation classes

Author: del Val Coral
Harari Oscar
Previti Christopher
Zwir Igor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/08/2013

Background: The computational prediction of DNA methylation has become an important topic in the recent years due to its role in the epigenetic control of normal and cancer-related processes. While previous prediction approaches focused merely on differences between methylated and unmethylated DNA sequences, recent experimental results have shown the presence of much more complex patterns of methylation across tissues and time in the human genome. These patterns are only partially described by a binary model of DNA methylation. In this work we propose a novel approach, based on profile analysis of tissue-specific methylation that uncovers significant differences in the sequences of CpG islands (CGIs) that predispose them to a tissue-specific methylation pattern. Results: We defined CGI methylation profiles that separate not only between constitutively methylated and unmethylated CGIs, but also identify CGIs showing a differential degree of methylation across tissues and cell-types or a lack of methylation exclusively in sperm. These profiles are clearly distinguished by a number of CGI attributes including their evolutionary conservation, their significance, as well as the evolutionary evidence of prior methylation. Additionally, we assess profile functionality with respect to the different compartments of protein coding genes and their possible use in the prediction of DNA methylation. Conclusion: Our approach provides new insights into the biological features that determine if a CGI has a functional role in the epigenetic control of gene expression and the features associated with CGI methylation susceptibility. Moreover, we show that the ability to predict CGI methylation is based primarily on the quality of the biological information used and the relationships uncovered between different sources of knowledge.
The strategy presented here is able to predict, besides the constitutively methylated and unmethylated classes, two more tissue specific methylation classes conserving the accuracy provided by leading binary methylation classification methods.

University of Bergen

Cis-cop: Multiobjective identification of cis-regulatory modules based on constrains

Author: Martínez Ballesteros María del Mar
Romero Zaliz Rocío
Val Coral del
Zwir Igor
Publication venue: 'Universidad de Huelva - UHU'
Publication date: 01/01/2010

Gene expression regulation is an intricate, dynamic phenomenon essential for all biological functions. The necessary instructions for gene expression are encoded in cis-regulatory elements that work together and interact with the RNA polymerase to confer specific spatial and temporal patterns of transcription. Therefore, the identification of these elements is currently an active area of research in computational analysis of regulatory sequences. However, the problem is difficult since the combinatorial interactions between the regulating factors can be very complex. Here we present a web server, Cis-cop, that identifies cis-regulatory modules given a set of transcription factor binding sites and, additionally, also RNA pol sites for a group of genes.

idUS. Depósito de Investigación Universidad de Sevilla

Optimization of multi-classifiers for computational biology: application to gene finding and expression

Author: Romero Zaliz Rocío
Rubio Escudero Cristina
Val Coral del
Zwir Igor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Genomes of many organisms have been sequenced over the last few years. However, transforming such raw sequence data into knowledge remains a hard task. A great number of prediction programs have been developed to address part of this problem: the location of genes along a genome and their expression. We propose a multi-objective methodology to combine state-of-the-art algorithms into an aggregation scheme in order to obtain optimal methods’ aggregations. The results obtained show a major improvement in sensitivity when our methodology is compared to the performance of individual methods for gene finding and gene expression problems. The methodology proposed here is an automatic method generator, and a step forward to exploit all already existing methods, by providing alternative optimal methods’ aggregations to answer concrete queries for a certain biological problem with a maximized accuracy of the prediction. As more approaches are integrated for each of the presented problems, de novo accuracy can be expected to improve further.
Funding: Ministerio de Ciencia y Tecnología TIN2006-12879; Junta de Andalucía TIC-0278

Springer - Publisher Connector

idUS. Depósito de Investigación Universidad de Sevilla

Uncovering the complex genetic architecture of human plasma lipidome using machine learning methods

Author: Lehtimäki Miikael
Val Muñoz María Coral Del
Zwir Nawrocki Jorge Sergio Igor
Publication venue: SpringerNature
Publication date: 22/02/2023

Genetic architecture of plasma lipidome provides insights into regulation of lipid metabolism and related diseases. We applied an unsupervised machine learning method, PGMRA, to discover phenotype-genotype many-to-many relations between genotype and plasma lipidome (phenotype) in order to identify the genetic architecture of plasma lipidome profiled from 1,426 Finnish individuals aged 30–45 years. PGMRA involves biclustering genotype and lipidome data independently followed by their inter-domain integration based on hypergeometric tests of the number of shared individuals. Pathway enrichment analysis was performed on the SNP sets to identify their associated biological processes. We identified 93 statistically significant (hypergeometric p-value < 0.01) lipidome-genotype relations. Genotype biclusters in these 93 relations contained 5977 SNPs across 3164 genes. Twenty-nine of the 93 relations contained genotype biclusters with more than 50% unique SNPs and participants, thus representing most distinct subgroups. We identified 30 significantly enriched biological processes among the SNPs involved in 21 of these 29 most distinct genotype-lipidome subgroups through which the identified genetic variants can influence and regulate plasma lipid related metabolism and profiles.
This study identified 29 distinct genotype-lipidome subgroups in the studied Finnish population that may have distinct disease trajectories and therefore could be useful in precision medicine research.
Funding: Research Council of Finland; Social Insurance Institution of Finland; Competitive State Research Financing of Expert Responsibility area of Kuopio, Tampere and Turku University Hospitals; Juho Vainio Foundation; Paavo Nurmi Foundation; Finnish Foundation for Cardiovascular Research; Finnish Cultural Foundation; Finnish IT center for science; Sigrid Juselius Foundation; Tampere Tuberculosis Foundation; Emil Aaltonen Foundation; Yrjo Jahnsson Foundation; Signe and Ane Gyllenberg Foundation; Diabetes Research Foundation of Finnish Diabetes Association 322098 286284 134309 126925 121584 124282 255381 256474 283115 319060 320297 314389 338395 330809 104821 129378 117797 141071 INFRAIA-2016-1-730897; Horizon 2020; European Research Council (ERC) European Commission 349708; Tampere University Hospital Supporting Foundation; Finnish Society of Clinical Chemistry; Spanish Government RTI2018-098983-B-100; Laboratoriolaaketieteen Edistamissaatio Sr; Ida Montinin saatio; Kalle Kaiharin saatio; Aarne Koskelon saatio; Faculty of Medicine and Health Technology, Tampere University; Project HPC-EUROPA3 X51001 50191928; EC Research Innovation Action under H2020 Programme 75532
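
The abstract above says PGMRA links genotype and lipidome biclusters through hypergeometric tests on the number of shared individuals. The Python sketch below shows what such an overlap test can look like, assuming a one-sided upper-tail test. The abstract does not say whether PGMRA uses exactly this form, and the function name and the example bicluster sizes are hypothetical.

```python
# Illustrative sketch only: a one-sided hypergeometric overlap test.
# This is not PGMRA's code; the function name and example numbers are assumptions.
from math import comb


def hypergeom_overlap_pvalue(population: int, size_a: int, size_b: int, shared: int) -> float:
    """Probability of observing at least `shared` common individuals when
    two groups of sizes `size_a` and `size_b` are drawn independently at
    random from `population` individuals."""
    if not (0 <= size_a <= population and 0 <= size_b <= population):
        raise ValueError("group sizes must lie between 0 and the population size")
    max_shared = min(size_a, size_b)
    if shared > max_shared:
        return 0.0
    first = max(shared, 0)
    total = comb(population, size_b)
    # Sum the hypergeometric probability mass from `first` up to the maximum overlap.
    favourable = sum(
        comb(size_a, i) * comb(population - size_a, size_b - i)
        for i in range(first, max_shared + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Hypothetical biclusters drawn from a cohort of 1,426 individuals.
    p = hypergeom_overlap_pvalue(population=1426, size_a=100, size_b=120, shared=30)
    print(f"p-value: {p:.3e}", "significant at 0.01" if p < 0.01 else "not significant")
```

Under this reading, a genotype-lipidome relation would be kept when the p-value falls below the 0.01 threshold quoted in the abstract.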

Repositorio Institucional Universidad de Granada

Optimization of multi-classifiers for computational biology: application to gene finding and expression

Author: Romero Zaliz Rocio Celeste
Val Muñoz María Coral Del
Zwir Nawrocki Jorge Sergio Igor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/10/2009

Genomes of many organisms have been sequenced over the last few years. However, transforming such raw sequence data into knowledge remains a hard task. A great number of prediction programs have been developed to address part of this problem: the location of genes along a genome and their expression. We propose a multi-objective methodology to combine state-of-the-art algorithms into an aggregation scheme in order to obtain optimal methods’ aggregations. The results obtained show a major improvement in sensitivity when our methodology is compared to the performance of individual methods for gene finding and gene expression problems. The methodology proposed here is an automatic method generator, and a step forward to exploit all already existing methods, by providing alternative optimal methods’ aggregations to answer concrete queries for a certain biological problem with a maximized accuracy of the prediction. As more approaches are integrated for each of the presented problems, de novo accuracy can be expected to improve further.
Funding: Ministry of Science and Innovation, Spain (MICINN) Spanish Government TIN-2006-12879; Junta de Andalucia TIC-02788; Howard Hughes Medical Institute; European Commission; Junta de Andaluci

Repositorio Institucional Universidad de Granada

Identification of differentially expressed small non-coding RNAs in the legume endosymbiont Sinorhizobium meliloti by comparative genomics

Author: Jiménez Zurdo José Ignacio
Toro Nicolás
Torres Quesada Omar
Val Muñoz María Coral Del
Publication venue: 'Wiley'
Publication date: 26/10/2007

Bacterial small non-coding RNAs (sRNAs) are being recognized as novel widespread regulators of gene expression in response to environmental signals. Here, we present the first search for sRNA-encoding genes in the nitrogen-fixing endosymbiont Sinorhizobium meliloti, performed by a genome-wide computational analysis of its intergenic regions. Comparative sequence data from eight related alpha-proteobacteria were obtained, and the interspecies pairwise alignments were scored with the programs eQRNA and RNAz as complementary predictive tools to identify conserved and stable secondary structures corresponding to putative non-coding RNAs. Northern experiments confirmed that eight of the predicted loci, selected among the original 32 candidates as most probable sRNA genes, expressed small transcripts. This result supports the combined use of eQRNA and RNAz as a robust strategy to identify novel sRNAs in bacteria. Furthermore, seven of the transcripts accumulated differentially in free-living and symbiotic conditions. Experimental mapping of the 5'-ends of the detected transcripts revealed that their encoding genes are organized in autonomous transcription units with recognizable promoter and, in most cases, termination signatures. These findings suggest novel regulatory functions for sRNAs related to the interactions of alpha-proteobacteria with their eukaryotic hosts.
Funding: Spanish Ministerio de Educación y Ciencia (Project AGL2006-12466/AGR); Junta de Andalucía (Project CV1-01522); NIH Grant 1R01GM070538-02; FPI Fellowship from the Spanish Ministerio de Educación y Cienci

Repositorio Institucional Universidad de Granada